Mapping Parallel English ...
Authors
Abstract
In this paper, we present a methodology for one-to-one (1:1) mapping of parallel English-Hindi sentences. The methodology is based on building a parallel English-Hindi word dictionary after syntactic and semantic analysis of the English-Hindi source text. We apply it to English and Hindi sentences, but it can also be used for other languages. Because a large English-Hindi parallel corpus is not usually available, we design and develop two strategies to overcome this problem: on the one hand, normalization of the tagged English sentences and the Hindi sentences; on the other, mapping of English-Hindi sentences using the parallel English-Hindi word dictionary. Fortunately, this task, word alignment, is well known, and some alignment algorithms are freely available.
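The two strategies named in the abstract, normalization followed by dictionary-based mapping, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `normalize` and `map_sentence_pair`, the toy dictionary, and the greedy first-match rule are all assumptions made for illustration, and the tagging step mentioned in the abstract is not modeled.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy dictionary: English word -> set of Hindi translations.
# A real system would build this from the analyzed source text.
TOY_DICTIONARY: Dict[str, Set[str]] = {
    "ram": {"राम"},
    "eats": {"खाता"},
    "mango": {"आम"},
}

# Characters stripped from token edges; includes the Devanagari danda.
PUNCTUATION = ".,!?;:\"'()।"


def normalize(sentence: str) -> List[str]:
    """Split on whitespace, strip edge punctuation, lowercase each token."""
    tokens = []
    for raw in sentence.split():
        token = raw.strip(PUNCTUATION).lower()
        if token:
            tokens.append(token)
    return tokens


def map_sentence_pair(
    english: str, hindi: str, dictionary: Dict[str, Set[str]]
) -> List[Tuple[str, str]]:
    """Greedy 1:1 mapping: each Hindi token is paired at most once."""
    en_tokens = normalize(english)
    hi_tokens = normalize(hindi)
    used: Set[int] = set()  # indices of Hindi tokens already paired
    pairs: List[Tuple[str, str]] = []
    for en in en_tokens:
        candidates = dictionary.get(en, set())
        for j, hi in enumerate(hi_tokens):
            if j not in used and hi in candidates:
                used.add(j)
                pairs.append((en, hi))
                break  # keep the mapping one-to-one
    return pairs


pairs = map_sentence_pair("Ram eats a mango.", "राम आम खाता है।", TOY_DICTIONARY)
```

Words missing from the dictionary (here "a" and "है") are simply left unmapped; how the original method handles such words is not stated in the abstract.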
Similar resources
Supporting Large English-Hindi Parallel Corpus using Word Alignment
This paper gives description about methodology to understand parallel English-Hindi sentences using word alignment. This methodology is foundation to develop the parallel English-Hindi word dictionary after syntactically and semantically analysis of the English-Hindi source text. Methodology of proposed system is used for the English and Hindi sentences; also the methodology can be used for othe...
Using Word Alignment to Extend Multilingual Medical Terminologies
Medical terminologies such as those provided in the UMLS are never exhaustive and there is a constant need to enrich them, especially in terms of multilinguality. We present a methodology to acquire new French translations of English medical terms based on word alignment in a parallel corpus — i.e. pairing of corresponding words. We automatically collected a 27.7-million-word parallel, English-...
An Evaluation Exercise for Word Alignment
This paper presents the task definition, resources, participating systems, and comparative results for the shared task on word alignment, which was organized as part of the HLT/NAACL 2003 Workshop on Building and Using Parallel Texts. The shared task included Romanian-English and English-French sub-tasks, and drew the participation of seven teams from around the world. 1 Defining a Word Alignme...
Creating Arabic-English Parallel Word-Aligned Treebank Corpora at LDC
This contribution describes an Arabic-English parallel word aligned treebank corpus from the Linguistic Data Consortium that is currently under production. Herein we primarily focus on efforts required to assemble the package and instructions for using it. It was crucial that word alignment be performed on tokens produced during treebanking to ensure cohesion and greater utility of the corpus. ...
Annotation Guidelines for Czech-English Word Alignment
We report on our experience with manual alignment of Czech and English parallel corpus text. We applied existing guidelines for English and French (Melamed 1998) and augmented them to cover systematically occurring cases in our corpus. We describe the main extensions covered in our guidelines and provide examples. We evaluated both intra- and inter-annotator agreement and obtained very good resul...
Journal title:
Volume, issue
Pages -
Publication date: 2012